Project-Team:IPSO


Inria | Raweb 2013 | Presentation of the Project-Team IPSO



Section: New Results

Optimization and parallelization of Emedge3D on shared memory architecture

In [38], we present a study of techniques used to speed up a scientific simulation code. The techniques include sequential optimizations as well as parallelization with OpenMP. The work is carried out on two different multicore shared-memory architectures: a cutting-edge 8x8-core machine and a more common 2x6-core board. Our target application is representative of many memory-bound codes, and the techniques we present show how to overcome the memory bandwidth limit, which is quickly reached on multi-core and many-core shared-memory architectures. To achieve efficient speedups, we apply strategies that lower the computation cost and maximize the use of processor caches. The optimizations are: minimizing memory accesses, simplifying and reordering computations, and tiling loops. On the 12-core Intel X5675 platform, the combination of these optimizations makes execution 21.6 times faster than the original version running on one core.
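The combination of loop tiling and OpenMP parallelization mentioned above can be illustrated with a minimal sketch. It is not taken from Emedge3D or from [38]: a matrix transpose stands in for a generic memory-bound kernel, and the names `TILE`, `transpose_naive`, `transpose_tiled` and `count_mismatches` are hypothetical, chosen only for this example. The tile size of 32 is an assumption and would need tuning for a given cache.

```c
#include <stddef.h>
#include <stdlib.h>

#define TILE 32  /* hypothetical tile edge; tune so that two tiles fit in cache */

/* Reference version: b = transpose(a), where a is rows x cols in row-major order.
 * Writes to b are strided by `rows`, so cache lines of b are reused poorly. */
void transpose_naive(const double *a, double *b, size_t rows, size_t cols)
{
    for (size_t i = 0; i < rows; ++i)
        for (size_t j = 0; j < cols; ++j)
            b[j * rows + i] = a[i * cols + j];
}

/* Tiled version: the iteration space is cut into TILE x TILE blocks so that
 * the parts of a and b touched by one block stay in cache while it is processed.
 * Bands of tiles are distributed over threads; each band writes a disjoint
 * set of entries of b, so no synchronization is needed.
 * Without OpenMP support the pragma is ignored and the loop runs sequentially. */
void transpose_tiled(const double *a, double *b, size_t rows, size_t cols)
{
    #pragma omp parallel for schedule(static)
    for (long ii = 0; ii < (long)rows; ii += TILE) {
        for (size_t jj = 0; jj < cols; jj += TILE) {
            size_t i0 = (size_t)ii;
            size_t i_end = i0 + TILE < rows ? i0 + TILE : rows;
            size_t j_end = jj + TILE < cols ? jj + TILE : cols;
            for (size_t i = i0; i < i_end; ++i)
                for (size_t j = jj; j < j_end; ++j)
                    b[j * rows + i] = a[i * cols + j];
        }
    }
}

/* Runs both versions on the same input and returns the number of entries
 * that differ (0 means the tiled version reproduces the reference),
 * or -1 if memory allocation fails. */
long count_mismatches(size_t rows, size_t cols)
{
    size_t n = rows * cols;
    double *a = malloc(n * sizeof *a);
    double *b1 = malloc(n * sizeof *b1);
    double *b2 = malloc(n * sizeof *b2);
    long bad = -1;

    if (a && b1 && b2) {
        for (size_t k = 0; k < n; ++k)
            a[k] = (double)k;
        transpose_naive(a, b1, rows, cols);
        transpose_tiled(a, b2, rows, cols);
        bad = 0;
        for (size_t k = 0; k < n; ++k)
            if (b1[k] != b2[k])
                ++bad;
    }
    free(a);
    free(b1);
    free(b2);
    return bad;
}
```

Calling `count_mismatches(100, 70)` checks the tiled kernel against the reference on a matrix whose dimensions are not multiples of the tile size. With GCC or Clang, the OpenMP directive takes effect when the code is compiled with `-fopenmp`. Any speedup depends on the machine and the problem size, and this sketch makes no claim about the figures reported in [38].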
